Preparation contextuality powers parity-oblivious multiplexing 
In a noncontextual hidden variable model of quantum theory, hidden variables determine the outcomes of every measurement in a manner that is independent of how the measurement is implemented. Using a generalization of this notion to arbitrary operational theories and to preparation procedures, we demonstrate that a particular two-party information-processing task, "parity-oblivious multiplexing," is powered by contextuality in the sense that there is a limit to how well any theory described by a noncontextual hidden variable model can perform. This bound constitutes a "noncontextuality inequality" that is violated by quantum theory. We report an experimental violation of this inequality in good agreement with the quantum predictions. The experimental results also provide the first demonstration of 2-to-1 and 3-to-1 quantum random access codes.

The Bell-Kochen-Specker theorem [1] shows that the predictions of quantum theory are inconsistent with a hidden variable model having the following feature: if A, B and C are Hermitian operators such that A and B commute, A and C commute, but B and C do not commute, then the value predicted to occur in a measurement of A does not depend on whether B or C was measured simultaneously. This feature is called "noncontextuality." Significantly, it is only well-defined for models of quantum theory (and then only for projective measurements and deterministic models) [?]. By contrast, Bell's definition of a local model applies to any theory that can be described operationally. Consequently, whereas one can test whether or not experimental statistics are consistent with a local model (by testing whether or not they satisfy Bell inequalities), there is no way to test whether or not experimental statistics are consistent with a noncontextual model (and no way of defining associated "noncontextuality inequalities") unless one generalizes the traditional notion of noncontextuality in such a way that it makes no reference to the quantum formalism. Suggestions for such a formulation have been made by several authors [?]. A particularly natural generalization (and slight modification) which applies to all models (deterministic or not) of any operational theory has been proposed in Ref. [?]. We here derive a noncontextuality (NC) inequality based on this notion.

Because information-theoretic tasks can be characterized entirely in terms of experimental statistics, one can explore whether theories that violate NC inequalities may provide information-theoretic advantages over theories that satisfy these inequalities. We prove that this is indeed the case for a task which we call parity-oblivious multiplexing, a kind of two-party secure computation. (The notion that contextuality might yield an advantage for multiplexing tasks was first put forward by Galvao [?].) The NC inequality we derive provides a bound on the probability of success in this task and we demonstrate a quantum protocol for parity-oblivious multiplexing for which the probability of success exceeds the noncontextual bound.

Finally, we report an experimental implementation of 
this protocol that achieves a probability of success in 
good agreement with the quantum result and in viola- 
tion of the NC inequality. 

Operational theories and noncontextual models. In an operational theory, the primitives of description are preparations and measurements, specified as instructions for what to do in the laboratory. The theory simply provides an algorithm for calculating the probability $p(k|P,M)$ of an outcome $k$ of measurement $M$ given a preparation $P$. As an example, in quantum theory, every preparation $P$ is represented by a density operator $\rho_P$, every measurement $M$ is represented by a positive operator valued measure $\{E_{M,k}\}$, and the probability of outcome $k$ is given by $p(k|P,M) = \mathrm{Tr}(\rho_P E_{M,k})$.

In a hidden variable model of an operational theory, a preparation procedure is assumed to prepare a system with certain properties and a measurement procedure is assumed to reveal something about those properties. The set of all variables describing the system is denoted $\lambda$. It is presumed that for every preparation $P$, there is a probability distribution $p(\lambda|P)$ such that implementing $P$ causes the system to be prepared in physical state $\lambda$ with probability $p(\lambda|P)$. Similarly, it is presumed that for every measurement $M$, there is a distribution $p(k|\lambda,M)$ such that implementing $M$ on a system described by $\lambda$ yields outcome $k$ with probability $p(k|\lambda,M)$. For the hidden variable model to reproduce the predictions of the operational theory, it must satisfy $p(k|P,M) = \int d\lambda\, p(k|\lambda,M)\, p(\lambda|P)$.

A hidden variable model is preparation noncontextual if the following implication holds

$$\forall M: \; p(k|P,M) = p(k|P',M) \;\rightarrow\; p(\lambda|P) = p(\lambda|P'), \tag{1}$$

that is, if two preparations yield the same statistics for all possible measurements then they are represented equivalently in the hidden variable model. Similarly, measurement noncontextuality is the condition that

$$\forall P: \; p(k|P,M) = p(k|P,M') \;\rightarrow\; p(k|\lambda,M) = p(k|\lambda,M'), \tag{2}$$

that is, if two measurements have the same statistics for all possible preparations then they are represented equivalently in the model. More details can be found in Ref. [?]. An NC inequality is any inequality on experimental statistics that follows from the assumption that there exists a hidden variable model that is preparation and measurement noncontextual. It is of the form $f(p(k|P_1,M_1), p(j|P_2,M_2), \ldots) \le C$ for some function $f$ and constant $C$.

Parity-oblivious multiplexing. Suppose that Alice and Bob wish to perform the following information-processing task, which we call n-bit parity-oblivious multiplexing. Alice has as input an n-bit string $x$ chosen uniformly at random from $\{0,1\}^n$. Bob has as input an integer $y$ chosen uniformly at random from $\{1,\ldots,n\}$ and must output the bit $b = x_y$, that is, the $y$th bit of Alice's input. Alice can send a system to Bob encoding information about her input; however, there is a cryptographic constraint: no information about any parity of $x$ can be transmitted to Bob. More specifically, letting $s \in \mathrm{Par}$, where $\mathrm{Par} = \{r \mid r \in \{0,1\}^n, \sum_i r_i \ge 2\}$ is the set of n-bit strings with at least two bits that are 1, no information about $x \cdot s = \bigoplus_i x_i s_i$ (termed the s-parity) for any such $s$ can be transmitted to Bob (here $\oplus$ denotes sum modulo 2). This task is similar to an n-to-1 quantum random access code [?] except that it has a constraint of parity-obliviousness rather than a constraint on the potential information-carrying capacity of the system used.

Lemma 1. Classically, the optimal probability of success in n-bit parity-oblivious multiplexing satisfies $p(b = x_y) \le (n+1)/2n$.

Proof. (For details, see Appendix A.) The only classical encodings of $x$ that reveal no information about any parity (while encoding some information about $x$) are those that encode only a single bit $x_i$ for some $i$. Given that $y$ is uniformly distributed, it makes no difference which bit it is. Therefore, we may assume that Alice and Bob agree that Alice will always encode the first bit, $x_1$. If $y = 1$, which occurs with probability $1/n$, then Bob can output $b = x_y$ and win. With probability $(n-1)/n$, we have $y \neq 1$ and in this case Bob can at best guess the value of $x_y$ and wins with probability 1/2. ■
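
The counting in this proof can be checked by brute force. The following is a minimal sketch, assuming the particular strategy named in the proof (Alice sends only the first bit; Bob outputs it when y = 1 and otherwise outputs a fixed guess of 0). The function name `classical_success` is introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product


def classical_success(n):
    """Exact success probability when Alice sends only x_1 (illustrative strategy)."""
    wins = 0
    for x in product((0, 1), repeat=n):
        message = x[0]  # Alice encodes only the first bit
        for y in range(1, n + 1):
            # Bob reads the message when asked for bit 1, otherwise guesses 0
            guess = message if y == 1 else 0
            wins += int(guess == x[y - 1])
    # Uniform prior over the 2^n strings x and the n values of y
    return Fraction(wins, 2**n * n)


for n in (2, 3, 4):
    print(n, classical_success(n), Fraction(n + 1, 2 * n))
```

For each n the two printed fractions agree, matching the bound of Lemma 1 for this strategy.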

What is the most general protocol that can be implemented in an arbitrary operational theory? For each input string $x$, Alice implements a preparation procedure $P_x$, and for each integer $y$, Bob implements a binary-outcome measurement $M_y$, and reports the outcome $b$ as his output. The probability of winning is

$$p(b = x_y) = \frac{1}{2^n n} \sum_{y \in \{1,\ldots,n\}} \sum_{x \in \{0,1\}^n} p(b = x_y | P_x, M_y), \tag{3}$$

where $1/(2^n n)$ is the prior probability for a particular $x$ and $y$. The parity-oblivious constraint requires that for every s-parity, there is no outcome of any measurement for which posterior probabilities for s-parity 0 and s-parity 1 are different, that is,

$$\forall s\, \forall M\, \forall k: \; \sum_{x | x \cdot s = 0} p(P_x|k,M) = \sum_{x | x \cdot s = 1} p(P_x|k,M). \tag{4}$$

Noncontextuality inequality. The main theoretical 
result of this letter is the following theorem. 

Theorem 2. In an operational theory that admits a preparation noncontextual hidden variable model, the optimal probability of success in n-bit parity-oblivious multiplexing satisfies $p(b = x_y) \le (n+1)/2n$.

Proof. Define $P_{s,b}$ to be the procedure obtained by choosing uniformly at random an $x$ such that $x \cdot s = b$ and implementing $P_x$. Clearly, for any measurement $M$, the probability of outcome $k$ given preparation $P_{s,b}$ is simply

$$p(k|P_{s,b},M) = \frac{1}{2^{n-1}} \sum_{x | x \cdot s = b} p(k|P_x,M). \tag{5}$$

Similarly, the probability of hidden variable $\lambda$ given an implementation of $P_{s,b}$ is simply

$$p(\lambda|P_{s,b}) = \frac{1}{2^{n-1}} \sum_{x | x \cdot s = b} p(\lambda|P_x). \tag{6}$$

Now note that one can re-express the parity-oblivious condition, Eq. (4), as $\forall s\, \forall M: \sum_{x|x \cdot s = 0} p(k|P_x,M) = \sum_{x|x \cdot s = 1} p(k|P_x,M)$ (it follows from Bayes' rule and the uniformity of the prior over $x$). Combining this with Eq. (5), we infer that $\forall s\, \forall M: p(k|P_{s,0},M) = p(k|P_{s,1},M)$, which is simply the statement that mixed preparations corresponding to opposite s-parities are indistinguishable by any measurement. But together with the assumption that the hidden variable model is preparation noncontextual, Eq. (1), this implies that $\forall s: p(\lambda|P_{s,0}) = p(\lambda|P_{s,1})$, which states that mixed preparations corresponding to opposite s-parities are also indistinguishable at the hidden variable level. Using Eq. (6) and Bayes' rule again, we obtain

$$\forall s: \; \sum_{x | x \cdot s = 0} p(P_x|\lambda) = \sum_{x | x \cdot s = 1} p(P_x|\lambda). \tag{7}$$

Therefore, even if one knew $\lambda$, the posterior probabilities for s-parity 0 and s-parity 1 would be the same, that is, one would know nothing about any s-parity of $x$. The argument so far can be summarized as follows: for preparation noncontextual models, parity-obliviousness at the operational level implies parity-obliviousness at the level of the hidden variables.

FIG. 1: Bloch representation of states and measurements in quantum 2-bit and 3-bit parity-oblivious multiplexing.

The hidden state $\lambda$ provides a classical encoding of $x$. But, as just shown, it is one that cannot contain information about any s-parity. We recall from lemma 1 that such encodings have information about at most one bit, $x_i$, of $x$. Consequently, even if Bob could determine $\lambda$ perfectly, he and Alice could at best achieve the optimal probability of success achievable in a classical protocol (specified in lemma 1), while if Bob is limited in his ability to determine $\lambda$ (as will be the case in general in a hidden variable model), they will do worse. ■

Quantum case. We now consider how well one can achieve parity-oblivious multiplexing in quantum theory. The following is a protocol for the 2-bit case that uses a single qubit as the quantum message. Alice encodes her 2 bits into the four pure quantum states with Bloch vectors $(\pm\frac{1}{\sqrt{2}}, \pm\frac{1}{\sqrt{2}}, 0)$ equally distributed on an equatorial plane of the Bloch sphere, as indicated in Fig. 1 (recall that a density operator $\rho$ is related to its Bloch vector $\vec r$ by $\rho = \frac{1}{2}(I + \vec r \cdot \vec\sigma)$, where $\vec\sigma$ is the vector of Pauli matrices). Bob measures along the x axis if he wishes to learn the first bit, and along the y axis if he wishes to learn the second. He guesses the bit value 0 upon obtaining the positive outcome. In all cases, the guessed value is correct with probability $\cos^2(\pi/8) \approx 0.853553$. Meanwhile, no information about the parity can be obtained by any quantum measurement given that the parity 0 and parity 1 mixtures are represented by the same density operator, $\frac{1}{2}\rho_{00} + \frac{1}{2}\rho_{11} = \frac{1}{2}\rho_{01} + \frac{1}{2}\rho_{10} = I/2$. We have a violation of the NC inequality of Thm. 2 because for n = 2, the upper bound on the probability of success is 3/4. By exploiting a connection with the Clauser-Horne-Shimony-Holt inequality [?], one can show that this protocol yields the maximum possible quantum violation of the NC inequality.
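
As a worked check of the quoted 2-bit success probability, here is a short derivation that uses only the Bloch-vector relation recalled above. The symbol $\hat n$ for Bob's measurement axis and the outcome label $+$ are notation introduced here, not the authors'.

```latex
\begin{align}
p(+\,|\,\vec r, \hat n)
  &= \mathrm{Tr}\!\left[\tfrac12\,(I + \vec r\cdot\vec\sigma)\;\tfrac12\,(I + \hat n\cdot\vec\sigma)\right]
   = \tfrac12\,(1 + \vec r\cdot\hat n), \\
p(b = x_y)
  &= \tfrac12\left(1 + \tfrac{1}{\sqrt 2}\right)
   = \cos^2(\pi/8) \approx 0.853553, \\
\tfrac12\,\rho_{\vec r} + \tfrac12\,\rho_{-\vec r}
  &= \tfrac14\left[(I + \vec r\cdot\vec\sigma) + (I - \vec r\cdot\vec\sigma)\right]
   = I/2 .
\end{align}
```

The second line uses the fact that each encoded state has a component of magnitude $1/\sqrt{2}$ along the axis Bob measures, with the sign fixed by the bit value. The third line applies to both parity mixtures because each is an equal mixture of two antipodal Bloch vectors.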

A protocol for 3-bit parity-oblivious multiplexing using a single qubit proceeds as follows. Alice encodes her three bits into a set of eight pure quantum states associated with Bloch vectors $(\pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}})$ forming a cube inside the Bloch sphere (see Fig. 1). Bob measures along the x, y or z axes to obtain the first, second or third bits. In all cases, the guessed value is correct with probability $\frac{1}{2}(1 + \frac{1}{\sqrt{3}}) \approx 0.788675$. The mixture of the four states corresponding to $x_1 \oplus x_2 = 0$ (i.e. s-parity 0 for s = (1, 1, 0)) is identical to the mixture of the four states corresponding to $x_1 \oplus x_2 = 1$ and is equal to $I/2$. Similarly for the two mixtures associated with each of the other three parities, $x_1 \oplus x_3$ (s = (1, 0, 1)), $x_2 \oplus x_3$ (s = (0, 1, 1)), and $x_1 \oplus x_2 \oplus x_3$ (s = (1, 1, 1)). The protocol is therefore parity-oblivious for all s-parities. Again we have a violation of the NC inequality because for n = 3 the upper bound on the probability of success is 2/3. It is an open question whether 0.788675 is the maximum possible quantum violation.
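
Both qubit protocols can be verified numerically. The sketch below assumes the sign convention that bit value $x_i$ maps to Bloch component $(-1)^{x_i}/\sqrt{n}$ (so bit 0 corresponds to the positive outcome); the function names `state`, `success_probability` and `parity_mixture` are illustrative, not taken from the paper.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),     # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),  # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),    # sigma_z
]


def state(x):
    """Density operator whose i-th Bloch component is (-1)^{x_i} / sqrt(n)."""
    n = len(x)
    rho = I2.copy()
    for bit, sigma in zip(x, PAULIS):
        rho = rho + ((-1) ** bit / np.sqrt(n)) * sigma
    return rho / 2


def success_probability(n):
    """Average of p(b = x_y | P_x, M_y) over uniform x and y, as in Eq. (3)."""
    total = 0.0
    for x in itertools.product((0, 1), repeat=n):
        for y in range(n):
            # Projector onto the outcome that Bob reports as bit value x[y]
            projector = (I2 + (-1) ** x[y] * PAULIS[y]) / 2
            total += np.trace(state(x) @ projector).real
    return total / (2**n * n)


def parity_mixture(n, s, b):
    """Uniform mixture of the states P_x with x.s = b (mod 2)."""
    xs = [x for x in itertools.product((0, 1), repeat=n)
          if sum(xi * si for xi, si in zip(x, s)) % 2 == b]
    return sum(state(x) for x in xs) / len(xs)


print(success_probability(2), success_probability(3))
for s in [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]:
    print(s, np.allclose(parity_mixture(3, s, 0), parity_mixture(3, s, 1)))
```

The printed success probabilities can be compared with $\cos^2(\pi/8)$ and $\frac{1}{2}(1 + 1/\sqrt{3})$, and each parity check compares the two mixtures that the text states are both equal to $I/2$.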

The 2-bit protocol was originally presented as a 2-to-1 quantum random access code by Wiesner [?] and rediscovered in Ref. [?], while the 3-bit protocol was presented in Ref. [?] as an instance of a 3-to-1 quantum random access code (the original idea is attributed to Chuang in Ref. [?]).

Experimental results. We experimentally demon- 
strate better-than-classical performance for 2-bit and 3- 
bit parity-oblivious multiplexing by implementing the 
quantum protocols using polarization qubits. Photon 
pairs from downconversion are coupled into single mode 
optical fibers. One photon acts as a trigger, while the 
other is used in the experiment. Alice's state prepara- 
tion consists of a fiber polarization controller, and a po- 
larizing beam displacer, rotated to the input state angle, 
used to ensure high-purity linearly polarized states for 
the 2-bit protocol. An additional quarter wave plate is 
used to prepare elliptically-polarized states for the 3-bit 
protocol. Bob's measurement consists of a polarizing 
beam displacer mounted in a computer-controlled rota- 
tion mount, followed by a single photon counting module. 
For our demonstration, a detector is placed at only a sin- 
gle output port of the beam displacer and the probability 
of each outcome is calculated from the relative number 
of counts for a given beam displacer angle and the one 
orthogonal to it. (Further details of the experimental 
set-up, including a figure, are provided in Appendix B.) 
Adjustment of the beam displacer and quarter wave plate angles allows measurement of the horizontal/vertical basis, the diagonal/anti-diagonal basis and the right/left-circular basis. Valid measurement events are heralded by a coincidence count between the directly detected photon and the experiment photon. These experimental procedures for a given $x$ and $y$ define the preparation $P_x$ and the measurement $M_y$ respectively.

We obtained probabilities $p(k = x_y | P_x, M_y)$ by accumulating statistics over approximately $3.5 \times 10^7$ coincidence counts for each $x$ and $y$ in the 2-bit scheme and $2.4 \times 10^7$ in the 3-bit scheme. Using Eq. (3), we calculated the 2-bit and 3-bit probabilities of success to be $p(b = x_y) = 0.851929 \pm 0.000030$ and $p(b = x_y) = 0.786476 \pm 0.000017$ respectively. The errors were determined from the Poissonian counting statistics of the parametric source and the small repeatability error in the wave plate settings, using standard error analysis techniques. These probabilities of success violate the NC inequality of Thm. 2 with a high degree of confidence: 3410 and 6922 standard deviations respectively. They are also close to the predicted quantum values of 0.853553 and 0.788675, achieving a violation that is 98.4% and 98.2% respectively of the gap between the NC bound and the quantum value.
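
The quoted gap fractions follow directly from the reported numbers. A minimal sketch of that arithmetic is given below; the helper name `gap_fraction` is introduced here for illustration.

```python
def gap_fraction(measured, n, quantum):
    """Fraction of the gap between the NC bound (n+1)/2n and the quantum value that is achieved."""
    nc_bound = (n + 1) / (2 * n)
    return (measured - nc_bound) / (quantum - nc_bound)


frac_2bit = gap_fraction(0.851929, 2, 0.853553)
frac_3bit = gap_fraction(0.786476, 3, 0.788675)
print(f"{frac_2bit:.1%}, {frac_3bit:.1%}")  # prints "98.4%, 98.2%"
```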

Just as Bell inequality violations are only surprising given the absence of signalling between the two wings of the experiment, the NC inequality violations are only surprising given the parity-oblivious property. However, whereas one can establish the absence of signalling by confirming that the two wings are space-like separated, one must directly test for transmission of information about the parity in our experiment. A consideration of how this is to be accomplished highlights two shortcomings in the operational definition of preparation noncontextuality of Eq. (1): in practice one can never implement all measurements and one never finds truly identical statistics. The first issue may be addressed by relying on previous experimental evidence for the existence of a tomographically complete set of measurements - one from which the statistics of any other measurement can be calculated - and testing indistinguishability relative to this set alone, as we shall do here. The second issue may be addressed by presuming a kind of continuity: closeness of experimental statistics implies closeness of the representations in the model [?] (this parallels the problem of dealing with imperfect alignment in traditional proofs of contextuality [?], where continuity also provides an answer [?]). In the present work, we simply demonstrate that the experimental statistics are close to parity-oblivious while yielding a large violation of the noncontextuality inequalities, and leave a more detailed analysis for future work.

We quantify the obliviousness of our experimental protocol for a particular s-parity by the maximum probability that Bob can correctly estimate this parity in a variation over all measurements. One can estimate this by implementing a tomographically complete set of measurements, then reconstructing the states $\rho_0$ and $\rho_1$ associated with s-parity 0 and s-parity 1, and finally making use of the fact that the maximum probability of discriminating these states is $\frac{1}{2} + \frac{1}{4}\mathrm{Tr}|\rho_0 - \rho_1|$. Among all s-parities, we calculate the largest such probability to be 0.5020 ± 0.0002. This calculation is not sufficient, however, because it neglects an imperfection in the experiment that also contributes to leakage of information about the parity, namely, that there is a small probability of more than one photon being sent to the experiment. By our characterization of the source, we estimate the probability of two photons to be 0.007 ± 0.003 relative to the single photon generation probability. If two photons pass through the polarizers in the ideal protocol, the maximum probability of correctly estimating the parity can be quite far from 1/2: it is 3/4 in the case of the 2-bit scheme and 2/3 for three of the four s-parities in the 3-bit scheme. However, the fact that this possibility occurs with low probability implies that the two-photon contribution to the probability of correct estimation is comparable to the one-photon contribution. (Contributions from three or more photons are negligible in comparison.) The weighted average of these contributions is easily calculated and the largest, among all s-parities, is found to be 0.504 ± 0.002. The fact that this is within one percent of 1/2 demonstrates that our experimental protocols are indeed close to parity-oblivious.
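
The discrimination bound and the ideal two-photon figure for the 2-bit scheme can be reproduced numerically. The sketch below assumes that two photons are modelled as two identical copies of the encoded qubit, and it assumes one plausible reading of "weighted average" (weights 1 and 0.007 for the one- and two-photon events). Neither assumption is confirmed by the text, and the names `helstrom` and `qubit` are illustrative.

```python
import numpy as np


def helstrom(rho0, rho1):
    """Maximum probability of discriminating two equiprobable states: 1/2 + 1/4 Tr|rho0 - rho1|."""
    eigenvalues = np.linalg.eigvalsh(rho0 - rho1)  # the difference is Hermitian
    return 0.5 + 0.25 * np.abs(eigenvalues).sum()


def qubit(rx, ry):
    """Density operator with Bloch vector (rx, ry, 0)."""
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
    return (np.eye(2) + rx * X + ry * Y) / 2


c = 1 / np.sqrt(2)
rho = {(0, 0): qubit(c, c), (1, 1): qubit(-c, -c),
       (0, 1): qubit(c, -c), (1, 0): qubit(-c, c)}


def pair(r):
    """Two identical copies of a single-photon state (assumed two-photon model)."""
    return np.kron(r, r)


one_photon = helstrom((rho[0, 0] + rho[1, 1]) / 2, (rho[0, 1] + rho[1, 0]) / 2)
two_photon = helstrom((pair(rho[0, 0]) + pair(rho[1, 1])) / 2,
                      (pair(rho[0, 1]) + pair(rho[1, 0])) / 2)

# Assumed weighting: measured one-photon value 0.5020, two-photon events at relative rate 0.007
weighted = (0.5020 + 0.007 * two_photon) / (1 + 0.007)
print(one_photon, two_photon, round(weighted, 3))
```

Under these assumptions the ideal one-photon value is 1/2, the two-photon value is 3/4, and the weighted figure rounds to 0.504, consistent with the numbers reported above.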

Given that the quantum protocols described herein are also 2-to-1 and 3-to-1 random access codes, our results constitute the first experimental demonstration of a quantum advantage for these tasks as well.

Finally, it is worth noting that every Bell inequality is a special case of an NC inequality where all assumptions of noncontextuality are justified by locality [?]. Consequently, every experimental violation of a Bell inequality demonstrates the impossibility of a noncontextual hidden variable model. Indeed, this is all that can be demonstrated by those that fail to seal the locality loophole [?]. Nonetheless, a dedicated experiment of the sort we have described here can achieve a large violation with high confidence at a smaller cost of experimental effort.

APPENDIX A: OPTIMAL CLASSICAL PROTOCOL FOR N-BIT PARITY-OBLIVIOUS MULTIPLEXING

We here provide a more detailed proof of lemma 1. First, note that by the assumption of parity-obliviousness, the classical message $m$ sent from Alice to Bob must satisfy

$$\forall s: \; \sum_{x | x \cdot s = 0} p(P_x|m) = \sum_{x | x \cdot s = 1} p(P_x|m). \tag{8}$$

By Bayes' theorem and the fact that the distribution over inputs $x$ is uniform, we can rewrite this as a constraint on $p(m|P_x)$, namely,

$$\forall s: \; \sum_{x | x \cdot s = 0} p(m|P_x) = \sum_{x | x \cdot s = 1} p(m|P_x). \tag{9}$$

As we will demonstrate (at the end of this section), this implies that $p(m|P_x)$ has the form

$$p(m|P_x) = p(0)\, p_0(m) + \sum_{i=1}^{n} p(i) \left[ p_{i,0}(m)\, \delta_{x_i,0} + p_{i,1}(m)\, \delta_{x_i,1} \right], \tag{10}$$

where $p(i)$ is a normalized probability distribution on $\{0,\ldots,n\}$, the functions $p_0(m)$, $p_{i,0}(m)$ and $p_{i,1}(m)$ are normalized probability distributions over $m$, and where $\delta_{a,b}$ is the Kronecker delta function (equal to 1 if $a = b$ and 0 otherwise).

It follows that any classical parity-oblivious multiplexing protocol can be interpreted as follows: Alice generates an integer $i \in \{0,\ldots,n\}$ from the distribution $p(i)$. Upon obtaining $i = 0$, she sends a message $m$ chosen from the distribution $p_0(m)$ (independent of the value of $x$). Upon obtaining $i \in \{1,\ldots,n\}$, she sends a message $m$ chosen from one of two distributions, depending on the value of the $i$th bit of $x$: the distribution is $p_{i,0}(m)$ if $x_i = 0$ and $p_{i,1}(m)$ if $x_i = 1$.

We now determine the choice of these distributions that leads to a maximum probability of winning. First note that if $i = 0$, Bob gets no information about $x$. This is clearly not optimal, so we may set $p(0) = 0$. Next note that the amount that Bob learns about $x_i$ depends on his ability to distinguish $p_{i,0}(m)$ from $p_{i,1}(m)$. To optimize the amount that Bob can learn, $p_{i,0}(m)$ and $p_{i,1}(m)$ must be chosen to be perfectly distinguishable. This is only possible if they are completely non-overlapping, that is, if $p_{i,0}(m)\, p_{i,1}(m) = 0$.

In an optimal decoding, Bob simply determines whether $m$ is in the support of $p_{y,0}(m)$ or of $p_{y,1}(m)$ and outputs $b = 0$ or 1 accordingly. This is optimal for the following reason. The message $m$ only contains information about $x_y$ if Alice happened to generate an $i$ that coincides with $y$ and in this case Bob will output $b = x_y$ with probability 1. When $i$ does not coincide with $y$, Bob gets no information about $x_y$ from $m$, so it is irrelevant what he outputs; given that $x_y$ is equally likely to be 0 or 1, his probability of having generated the correct output will be 1/2.

Finally, given that $y$ is chosen uniformly at random, the probability of $i$ coinciding with $y$ is $1/n$, so that the overall probability of a correct output is $\frac{1}{n}(1) + \left(1 - \frac{1}{n}\right)\left(\frac{1}{2}\right) = (n+1)/2n$.

It is worth noting that there are many natural schemes that achieve the optimum:

• If $p(i) = \delta_{i,j}$ for some particular $j \in \{1,\ldots,n\}$, and $p_{j,0}(m)\, p_{j,1}(m) = 0$, then Alice has simply encoded the value of $x_j$ in her message.

• For $p(i)$ an arbitrary distribution over $\{1,\ldots,n\}$, if $p_{i,0}(m)\, p_{i,1}(m) = 0$ for all $i$, then Alice has simply chosen a value $i \in \{1,\ldots,n\}$ according to this distribution and encoded $x_i$ in her message.

• For $p(i)$ an arbitrary distribution over $\{1,\ldots,n\}$, if $p_{i,b}(m)\, p_{i',b'}(m) = 0$ when either $b \neq b'$ or $i \neq i'$, then Alice has encoded both $i$ and $x_i$ in her message.

It remains to prove our claim that the parity-oblivious constraint, Eq. (9), implies the decomposition of $p(m|P_x)$ described in Eq. (10). We do this using Fourier analysis over $\mathbb{Z}_2^n$. Let $r \in \{0,1\}^n$. Define functions $\chi_r : \{0,1\}^n \to \{-1, 1\}$ where

$$\chi_r(x) := (-1)^{x \cdot r}.$$

These form an orthonormal set (up to the normalization factor $2^n$) because

$$\sum_x \chi_r(x)\, \chi_{r'}(x) = \sum_{x \in \{0,1\}^n} (-1)^{x \cdot (r \oplus r')} = 2^n \delta_{r,r'}.$$

Moreover, noting that the dimensionality of the space of functions on $\{0,1\}^n$ is $2^n$ (a parameter for every input string) and that there are $2^n$ values of $r$, we see that the $\chi_r$ form an orthonormal basis of the function space. It follows that we can write $p(m|P_x)$ in the Fourier series

$$p(m|P_x) = \sum_r \hat p(m,r)\, \chi_r(x).$$

We infer that

$$2^n\, \hat p(m,r) = \sum_x \chi_r(x)\, p(m|P_x) = \sum_{x | x \cdot r = 0} p(m|P_x) - \sum_{x | x \cdot r = 1} p(m|P_x).$$

Combining this with the parity-obliviousness condition, Eq. (9), one obtains

$$\forall s \in \mathrm{Par}: \; \hat p(m,s) = 0.$$

Consequently, the only strings $r$ for which $\hat p(m,r) \neq 0$ are those with Hamming weight 0 or 1. Denoting the Fourier coefficient of the all-zero string by $\hat p_0(m)$ and that of the string with a single 1 at position $i$ by $\hat p_i(m)$, we have

$$p(m|P_x) = \hat p_0(m) + \sum_{i=1}^{n} \hat p_i(m)\, (-1)^{x_i}.$$

Because $(-1)^{x_i} = \delta_{x_i,0} - \delta_{x_i,1}$ and $1 = \delta_{x_i,0} + \delta_{x_i,1}$, we can write

$$p(m|P_x) = a_0(m) + \sum_{i=1}^{n} \left[ a_{i,0}(m)\, \delta_{x_i,0} + a_{i,1}(m)\, \delta_{x_i,1} \right], \tag{11}$$

where we have defined nonnegative coefficients

$$a_{i,0}(m) = 2\hat p_i(m), \quad a_{i,1}(m) = 0 \quad \text{if } \mathrm{sgn}(\hat p_i(m)) \ge 0,$$
$$a_{i,0}(m) = 0, \quad a_{i,1}(m) = -2\hat p_i(m) \quad \text{if } \mathrm{sgn}(\hat p_i(m)) < 0;$$

and we have implicitly defined a constant $a_0(m)$, which we presently show is also nonnegative. To do this, we define an n-bit string $z(m)$ that encodes the signs of the Fourier coefficients. Specifically, $z(m)$ is defined by

$$z_i(m) = \begin{cases} 1 & \text{if } \mathrm{sgn}(\hat p_i(m)) \ge 0, \\ 0 & \text{if } \mathrm{sgn}(\hat p_i(m)) < 0. \end{cases}$$

It follows from this definition that

$$a_{i,0}(m)\, \delta_{z_i(m),0} + a_{i,1}(m)\, \delta_{z_i(m),1} = 0$$

for all $i$, and consequently that

$$p(m|P_{z(m)}) = a_0(m),$$

which establishes that $a_0(m) \ge 0$.

Finally, we show that Eq. (11) can be put into the form of Eq. (10). By the normalization of the distribution $p(m|P_x)$, we have

$$\sum_m p(m|P_x) = \sum_m a_0(m) + \sum_{i=1}^{n} \sum_m a_{i,x_i}(m) = 1$$

for all $x$. Defining $A_0 = \sum_m a_0(m)$ and $A_{i,x_i} = \sum_m a_{i,x_i}(m)$, we have

$$A_0 + \sum_{i=1}^{n} A_{i,x_i} = 1,$$

for all $x$, which implies that $\sum_{i=1}^{n} A_{i,x_i}$ is independent of $x$ and in particular of $x_i$. We deduce that

$$A_{i,0} = A_{i,1}$$

for all $i$. Eq. (10) now follows from Eq. (11) by identifying

$$p(0) = A_0, \qquad p(i) = A_{i,0} = A_{i,1},$$

and

$$p_0(m) = a_0(m)/p(0), \qquad p_{i,b}(m) = a_{i,b}(m)/p(i),$$

if $p(0), p(i) \neq 0$.



APPENDIX B: EXPERIMENTAL DETAILS 

A schematic of the experimental set-up is provided in Fig. 2. We used type-I downconversion in bismuth borate (BiBO) to generate pairs of 820 nm, horizontally polarized single photons from a 410 nm, 60 mW continuous-wave diode laser. A 10 nm FWHM interference filter is used to reject background light. In the experiment, we obtained coincidence rates (2.5 ns window) of approximately 23100 pairs/s in the 2-bit scheme and 15200 pairs/s in the 3-bit scheme.

FIG. 2: Experimental set-up. The parametric downconversion source provides single photons to the experiment. Detection events in the experiment are counted in coincidence with the downconversion trigger photon (not shown).

Although we chose to implement the experiment with a heralded mode of a downconversion source, similar results could also have been obtained with weak coherent states (using the same measurements) [13]. In both cases, one must postselect on not finding the vacuum (implying, incidentally, that the detector loophole is not sealed [14]), and in both cases there is a small amplitude for more than one photon and hence a small amount of leaked parity information. Our choice was motivated by differences in ideal performance - it is only for the downconversion scheme that the leakage of parity information can be eliminated in principle (through the use of true single photons heralded by efficient number-resolving detectors). Nonetheless, this ideal has not yet been realized.